Small sample and selection bias effects in calibration under latent factor regression models

Author

  • Rolf Sundberg
Abstract

We study the bias of predictors obtained when a multivariate calibration procedure is used to relate a scalar y (the concentration of an analyte, say) to a vector x (spectral intensities, say). The data are assumed to follow a latent factor regression model, with multiple regression models and errors-in-variables models as special cases. The calibration procedures explicitly studied are ordinary least squares regression (OLSR), partial least squares regression (PLSR) and principal components regression (PCR). When the y-values in the calibration sample have been more or less systematically selected to achieve increased variation (overdispersion), a practical device for increasing precision, the coefficients of the predictor become biased. This bias can be seen when observed y is regressed on predicted ŷ(x) for a separate validation set. A second bias effect is a sample size effect, which increases as the calibration sample size decreases and as the dimension of x increases (it is absent when x is univariate). Formulae are given for these bias effects, both separately and in combination, and the formulae are illustrated and compared with simulation results. As a qualitative example, PLSR and PCR are less sensitive than OLSR to small samples, but equally sensitive to selection.
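The two bias effects described in the abstract can be explored with a small simulation. The sketch below is not the paper's simulation design. It assumes a single Gaussian latent factor, equal loadings, noise standard deviations of 0.5, OLSR only, and selection by keeping the lowest and highest y-values from a pool ten times the calibration size. All function names and parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate(n, p, rng, noise_x=0.5, noise_y=0.5):
    """Draw (x, y) from an assumed one-factor latent model."""
    t = rng.standard_normal(n)                        # latent factor
    x = t[:, None] + noise_x * rng.standard_normal((n, p))
    y = t + noise_y * rng.standard_normal(n)
    return x, y

def select_extremes(x, y, n_keep):
    """Keep the n_keep units with the lowest and highest y (overdispersion)."""
    order = np.argsort(y)
    half = n_keep // 2
    idx = np.concatenate([order[:half], order[-(n_keep - half):]])
    return x[idx], y[idx]

def fit_ols(x, y):
    """OLS with intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, x):
    return coef[0] + x @ coef[1:]

def slope_y_on_yhat(y, yhat):
    """Slope of the simple regression of observed y on predicted yhat."""
    yc = y - y.mean()
    hc = yhat - yhat.mean()
    return (hc @ yc) / (hc @ hc)

def mean_validation_slope(n_cal, p, selected, reps=200, seed=0):
    """Average validation slope of y on yhat over repeated calibrations."""
    rng = np.random.default_rng(seed)
    x_val, y_val = simulate(5000, p, rng)             # unselected validation set
    slopes = []
    for _ in range(reps):
        if selected:
            x_pool, y_pool = simulate(10 * n_cal, p, rng)
            x_cal, y_cal = select_extremes(x_pool, y_pool, n_cal)
        else:
            x_cal, y_cal = simulate(n_cal, p, rng)
        coef = fit_ols(x_cal, y_cal)
        slopes.append(slope_y_on_yhat(y_val, predict(coef, x_val)))
    return float(np.mean(slopes))

if __name__ == "__main__":
    print("large random sample:", mean_validation_slope(500, 8, False, reps=20))
    print("small random sample:", mean_validation_slope(15, 8, False))
    print("large selected sample:", mean_validation_slope(500, 8, True, reps=20))
```

Under these assumptions, a validation slope near 1 indicates an approximately unbiased predictor, and slopes clearly below 1 would be expected both for the small calibration sample and for the selected one. Replacing `fit_ols` with a PLSR or PCR fit would be the natural way to compare methods, but that step is left out here.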


Similar resources

Small sample bias and selection bias effects in multivariate calibration, exemplified for OLS and PLS regressions

In multivariate calibration by, for example, ordinary least squares (OLS) multiple regression or partial least squares regression (PLSR), the predictor ŷ(x) is perfect for the calibration sample itself, in the sense that the regression of observed y on predicted ŷ(x) is y = ŷ(x). Plots of y against ŷ(x) are widely used to illustrate how good the calibration is and how well prediction works. Usual...


Spatial Regression in the Presence of Misaligned data

In this paper, four approaches are presented to the problem of fitting a linear regression model in the presence of spatially misaligned data. These approaches are the plug-in method, simulation, regression calibration and maximum likelihood. In the first two approaches, with modeling the correlation between the explanatory variable, prediction of explanatory variable is determined at sites...


Detection of and Adjustment for Multiple Unmeasured Confounding Variables in Logistic Regression by Bayesian Structural Equation Modeling

Aim: To compare the bias magnitude between logistic regression and Bayesian structural equation modeling (SEM) in a small sample with strong unmeasured confounding from two correlated latent variables. Study Design: Statistical analysis of artificial data. Methodology: Artificial binary data with above characteristics were generated and analyzed by logistic regression and Bayesian SEM over a pl...


THE COMPARISON OF TWO METHOD NONPARAMETRIC APPROACH ON SMALL AREA ESTIMATION (CASE: APPROACH WITH KERNEL METHODS AND LOCAL POLYNOMIAL REGRESSION)

Small area estimation is a technique used to estimate parameters of subpopulations with small sample sizes. Small area estimation is needed in obtaining information on a small area, such as a sub-district or village. Generally, in some cases, small area estimation uses parametric modeling. But in fact, a lot of models have no linear relationship between the small area average and the covariat...


Using latent variables in a logistic regression model to remove the effect of multicollinearity in the analysis of some factors related to breast cancer

Background and Objectives: Logistic regression is one of the most widely used generalized linear models for analysis of the relationships between one or more explanatory variables and a categorical response. Strong correlations among explanatory variables (multicollinearity) reduce the efficiency of model to a considerable degree. In this study we used latent variables to reduce the effects of ...



Publication date: 2007